Abstract. Ten years ago Hopcroft and Tarjan discovered a class of very fast
algorithms for solving graph problems such as biconnectivity and strong connectivity.
While these depth-first-search algorithms are complex and can be difficult
to understand, the problems they solve have simple combinatorial definitions that
can themselves be considered algorithms, though they might be very inefficient or
even infinitary. We demonstrate here how the efficient algorithms can be systematically
derived using program transformation steps from the intuitive but preliminary
definitions.

There are several justifications for this work. First, we believe that the evolutionary
approach used in this paper offers more natural explanations of the algorithms
than  the  usual  a  posteriori  proofs  that  appear  in  textbooks.  Second,  the  derivations 
illustrate  several  high-level  principles  of  program  derivation  and  suggest  methods  by 
which  these  principles  can  be  realised  by  sequences  of  program  transformation  steps. 
Third,  these  examples  illustrate  how  external  domain-specific  knowledge  can  enter 
into  the  program  derivation  process.  This  is  the  first  occasion  that  such  efficient 
graph  algorithms  have  been  systematically  derived. 


1.  Introduction. 

Discovery  of  efficient  algorithms  is  a  complex  and  undisciplined  task,  requiring  sophisticated 
knowledge  of  both  general-purpose  algorithm  design  techniques  and  special-purpose  mathematical 
facts  related  to  the  problems  being  solved.  While  the  process  of  algorithm  discovery  is  certain  to  be 
exceedingly  difficult  to  mechanize,  there  is  much  to  be  learned — both  about  algorithms  and  about 
programming — from  the  study  of  the  structure  of  derivations  of  complex  algorithms. 

In  this  paper  we  demonstrate  how  program  transformation  techniques  can  be  used  to  derive 
efficient graph algorithms from intuitive specifications. These specifications are simple combinatorial
definitions that we choose to interpret as algorithms, even though (as algorithms) they
might  be  very  inefficient  or  even  infinitary. 

The derivations suggest ways in which algorithm-design knowledge separates from domain-specific
knowledge. While the depth-first algorithms we derive depend on deep combinatorial
properties of depth-first spanning forests, the algorithms can nonetheless be derived using only
general-purpose program derivation techniques, supported by the necessary combinatorial lemmas.
Indeed, we are optimistic that this sort of separation can be achieved in general, and surveys such
as [Tarjan77] appear to support this possibility. If this proves possible, then the program derivation
techniques refined and applied here and elsewhere can ultimately be of use in practical mechanical
programming aids: aids designed primarily for the programmer, not the algorithm designer.

Program derivation techniques also provide a natural way of explaining complicated algorithms.
Conventional proofs may succeed in convincing a reader of the correctness of an algorithm
without supplying any hint of why the algorithm works or how it came about. A derivation,
on the other hand, is analogous to a constructive proof; it takes a reader step by step from an
initial algorithm he accepts as a specification of the problem to a highly connected and efficient
implementation of it. As in [Clark80], we are deriving a family of related algorithms. Even though
the algorithms we derive here do not all have the same specifications, the strong relations between
them become manifest in the explicit structure of their derivations.

In  Section  2  of  this  paper  we  derive  a  family  of  depth-first  search  algorithms.  These  are 
generalised  and  utilized  in  quite  different  ways  in  the  biconnectivity  algorithms  of  Section  3  and 
in  the  strong-connectivity  algorithms  of  Section  4.  These  algorithms  were  discovered  by  Hopcroft 
and Tarjan and are (conventionally) presented in [Tarjan72] and [AHU74]. The variant of Tarjan’s
strong-connectivity  algorithm  that  we  derive  in  Section  4  is  attributed  to  Kosaraju.  (We  can  also 
apply  similar  techniques  to  derive  the  almost-linear-time  algorithm  of  [Tarjan73]  for  flow-graph 
reducibility.)  In  the  conclusion  we  discuss  further  the  implications  of  this  work. 

Because  we  seek  to  demonstrate  how  derivations,  clearly  presented,  can  lead  to  a  better 
understanding  of  the  algorithms  derived,  the  emphasis  in  this  paper  is  primarily  on  the  conceptual 
structure  of  the  derivations  and  only  secondarily  on  the  actual  formal  transformation  techniques. 
Indeed,  most  of  the  transformation  techniques  we  use  have  appeared  elsewhere,  though  perhaps 
in  other  forms.  We  make  use  of  transformations  for  realizing  complex  recursive  control  structure 
as explicit data structure similar to those described in [Bird80], [Scherlis80], and [Wand80]. In
addition,  we  make  implicit  use  of  transformations  such  as  those  described  in  [Burstall77]  or 
[Scherlis81]  to  effect  the  merging  or  “jamming”  of  loops  and  to  specialize  function  definitions. 
Discussion  of  loop  jamming  techniques  also  appears  in  [Paige81]. 

We  do  not,  in  this  summary,  specify  precisely  the  programming  language  we  use,  except  to 
say  that  it  is  a  straightforward  LISP-like  (or  ML-like)  applicative  language  supplemented  with 
assignment  to  variables  and  modification  of  data  structures.  For  the  sake  of  clarity  of  derivations, 
it  is  important  that  the  programming  language  not  be  overly  constrained.  In  particular,  certain 
features  that  are  difficult  to  implement  but  which  have  clear  semantics  often  allow  derivations 
to  be  quite  straightforward.  This  is  vividly  illustrated  in  the  case  of  the  SETL  language  in  the 
derivations of [Paige81]. Another example is the language used in [Scherlis81], which was extended
(to  include  expression  procedures — used,  for  example,  in  Algorithm  3.1  below)  in  order  to  keep  the 
set  of  transformations  simple  and  yet  strong-equivalence  preserving. 


2.  Depth-First  Search. 

We consider first the case of undirected graphs. Let G = (V, E) be an undirected graph with
adjacency list representation; for each v ∈ V, Adj(v) is a canonically ordered list of edges connected
to v. Observe that v ∈ Adj(u) if and only if u ∈ Adj(v).

Paths. As our starting point, we take the combinatorial definition of a path in a graph. Let
u and v range over vertices.


    path(u, v) ⇐ (u = v or (∃w ∈ Adj(u)) path(w, v))                    (2.0)


This  definition,  considered  as  an  algorithm,  has  potentially  infinite  execution  paths.  Suitable 
semantics  for  the  or  operator  would  allow  this  algorithm  to  terminate  correctly  whenever  there  is  a 
graph path, but for many graphs and vertex pairs this “algorithm” has no finite execution paths.

We can, however, distinguish two kinds of infinite execution paths: looping paths and divergent
paths. Roughly, a nonterminating path is a looping path when only finitely many distinct
recursive  calls  are  made  along  that  path;  if  the  number  of  distinct  calls  grows  without  bound,  then 
the  path  is  divergent.  In  the  case  of  finite  graphs  (the  only  graphs  we  will  consider)  Algorithm  2.0 
can  exhibit  looping,  but,  because  u  and  v  are  vertices  and  the  set  of  vertices  is  finite,  it  cannot 
exhibit  divergence. 
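
The looping behaviour can be observed directly. The following is a minimal sketch of ours (in Python, not the applicative language of the paper) that transcribes Definition 2.0 literally. The dictionary `adj` standing for Adj and the two small example graphs are assumptions made for illustration. Python evaluates `or` and `any` sequentially, so it does not supply the more generous semantics for or discussed above.

```python
def path(adj, u, v):
    """Literal reading of Definition 2.0: u = v, or some w in Adj(u) has path(w, v)."""
    return u == v or any(path(adj, w, v) for w in adj[u])


# Hypothetical example 1: a directed chain 0 -> 1 -> 2 has only finite execution paths.
chain = {0: [1], 1: [2], 2: []}
chain_result = path(chain, 0, 2)

# Hypothetical example 2: an undirected edge {0, 1} and an unreachable target 2.
# Only the two distinct calls path(0, 2) and path(1, 2) ever occur, so the
# nonterminating execution is a looping path; Python reports it as RecursionError.
loop = {0: [1], 1: [0], 2: []}
try:
    path(loop, 0, 2)
    looped = False
except RecursionError:
    looped = True
```

Since the set of distinct calls is bounded by the number of vertices, the failure here is a loop and not a divergence, which is what makes the marking transformation described next applicable.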

By  a  semantic  sleight  of  hand,  looping  evaluations  can  be  replaced  with  finite  ones.  In  the  case 
of  Algorithm  2.0,  it  is  consistent  with  our  interpretation  to  replace  all  looping  paths  with  false. 
We  effect  this  change  by  introducing  explicit  data  structure  to  mark  nodes  as  they  are  visited;  by 
examining  this  data  structure,  the  program  can  foreclose  any  potentially  looping  execution  paths. 
(Transformations  for  carrying  out  this  kind  of  change  are  sketched  in  [Scherlis80]  and  are  related 
to the closed-world database techniques described in [Clark78] and [Reiter78].)

The  transformation  has  two  steps.  First,  we  observe  that  the  second  parameter  of  path  never 
changes  and  so  can  be  made  free.  This  reduces  the  number  of  possible  recursive  calls  and  hence 
the  amount  of  data  structure  required. 

    path(u, v) ⇐ vpath(u)
    where                                                                (2.1)
    vpath(u) ⇐ (u = v or (∃w ∈ Adj(u)) vpath(w))

Next, we introduce a boolean array, visit(V), initially false for each vertex v ∈ V.

    path(u, v) ⇐ begin visit(V) ← false; vpath(u) end
    where
    vpath(u) ⇐                                                           (2.2)
      if visit(u) then false
      else begin visit(u) ← true; (u = v or (∃w ∈ Adj(u)) vpath(w)) end

(The  value  of  a  block  is  the  value  of  the  last  expression  unless  some  other  expression  is  marked  by 
the  word  value,  in  which  case  the  value  of  that  expression  is  saved  when  it  is  evaluated  and  returned 
after  evaluation  of  the  remainder  of  the  block  is  complete;  by  convention  imperative  statements 
are  always  enclosed  in  blocks.)  This  imperative  program  terminates  for  all  finite  graphs.  Note  that 
the same effect could be achieved in a purely applicative framework by adding another parameter
(representing  a  continuation),  but  the  resulting  program  would  be  less  clear  for  our  purposes. 
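
For concreteness, Algorithm 2.2 can be transcribed almost line for line. The Python sketch below is our illustration and not part of the derivation: the dictionary `adj` standing for Adj, the dictionary `visit` standing for the boolean array, and the example graph are all assumed representations.

```python
def path(adj, u, v):
    """Algorithm 2.2: the path test with looping executions replaced by false."""
    visit = {x: False for x in adj}           # visit(V) <- false

    def vpath(u):
        if visit[u]:
            return False                      # a second visit could only loop
        visit[u] = True
        return u == v or any(vpath(w) for w in adj[u])

    return vpath(u)


# Hypothetical undirected graph: a path 0 - 1 - 2 and an isolated vertex 3.
graph = {0: [1], 1: [0, 2], 2: [1], 3: []}
```

The second parameter v is free in `vpath`, exactly as in Algorithm 2.1, so only the first parameter needs to be recorded in the visit structure.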

The  finite  depth-first  search  algorithm  is  obtained  by  rotating  the  initial  visit  test  from  callee 
to  caller  and  by  writing  vpath  as  a  function,  dfs,  defined  such  that 

    v ∈ dfs(u) ⇔ path(u, v), i.e., dfs(u) = {v | path(u, v)}.

The function dfs will thus precompute the set of possible paths from a given vertex: the connected
component associated with that vertex. (Again, we assume visit(V) is initially false.)

    dfs(u) ⇐
      begin
        visit(u) ← true;                                                 (2.3)
        {u} ∪ ⋃_{w ∈ Adj(u)} (if visit(w) then ∅ else dfs(w))
      end

This function can easily be derived by specializing the definition of path to the set computation
specified above using the techniques of [Scherlis81]. (A more detailed example of specialization
appears in Section 3 below.)

Connected  Components.  The  set  of  connected  components 

    comps(V) ⇐ ⋃_{r ∈ V} { begin visit(V) ← false; dfs(r) end }

can be quickly computed by making use of the visit array.

    comps(V) ⇐ begin visit(V) ← false; ⋃_{r ∈ V} (if visit(r) then ∅ else {dfs(r)}) end    (2.4)

(This is derived by using the applicative representation for the visit array and noting redundant
components.) This program requires O(|V| + |E|) time.
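
A minimal sketch of Algorithms 2.3 and 2.4 in Python, under the assumption that Adj is given as a dictionary of neighbour lists (the name `adj` and the example graph are ours). The big union is realised as a sequential loop, so that each visit test sees the marks left by the recursive calls that precede it.

```python
def comps(adj):
    """Algorithm 2.4: connected components, sharing one visit array."""
    visit = {x: False for x in adj}           # visit(V) <- false

    def dfs(u):
        # Algorithm 2.3: the visit test has been rotated from callee to caller.
        visit[u] = True
        found = {u}
        for w in adj[u]:
            if not visit[w]:
                found |= dfs(w)
        return found

    # The guard is re-evaluated for each r, after the earlier dfs calls finish.
    return [dfs(r) for r in adj if not visit[r]]


# Hypothetical undirected graph with three components.
graph = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3], 5: []}
components = comps(graph)
```

Each vertex is marked once and each adjacency list is scanned once, which is where the linear bound comes from. A recursive Python version is limited by the interpreter's recursion depth on long paths; an explicit stack would avoid this at some cost in clarity.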

Depth-First  Spanning  Forests.  The  fast  depth-first  search  algorithms  are  based  on  subtle 
combinatorial  properties  of  the  depth-first  spanning  forests  implicit  in  the  prior  algorithms.  In  the 
case  of  undirected  graphs,  the  depth-first  search  divides  the  edges  of  a  graph  into  two  sets,  tree 
edges,  the  edges  actually  traversed  during  search,  and  the  other  edges,  which  are  called  fronds.  We 
use the notation u → v to indicate tree edges and u ⇢ v to indicate fronds.

We  will  occasionally  need  to  distinguish  the  fronds  explicitly  during  search.  With  respect  to 
Algorithm 2.3, we observe that the fronds are exactly those edges (u, w) for which the visit(w) test
is true but (since the graph is undirected) such that w is not the father of u in the search tree.

    dfs(u) ⇐
      begin
        visit(u) ← true;
        {u} ∪ ⋃_{w ∈ Adj(u)} (if visit(w)                                (2.5)
                 then (if w ≠ father(u) then assert[u ⇢ w]); ∅
                 else assert[u → w ∧ father(w) = u]; dfs(w))
      end

Here  we  have  decorated  Algorithm  2.3  with  assertions  distinguishing  the  two  sets  of  edges.  The 
father  function,  which  is  defined  implicitly  by  the  assertion,  can  be  realized  explicitly  using  data 
structure,  in  effect  transforming  the  assertion  into  an  assignment.  This  requires  a  very  simple  proof 
by induction that the instance of father(u) in the test will have been previously assigned a value.
(In the case of a root, father can return a special value, say Λ, that will cause the test to fail.)

Using  the  specialization  techniques  mentioned  above,  we  can  eliminate  all  references  to  the 
father  function/array  by  introducing  a  new  parameter  to  dfs  that  will  be  the  father  of  u  in  the 
depth-first  search  tree.  This  requires  a  slight  modification  of  the  definition  of  comps.  (For  esthetic 
reasons  we  also  reorient  the  nested  conditionals.) 

    comps(V) ⇐ ⋃_{r ∈ V} (if visit(r) then ∅ else {dfs(r, Λ)})

    dfs(u, v) ⇐
      begin
        visit(u) ← true;                                                 (2.6)
        {u} ∪ ⋃_{w ∈ Adj(u)} (if ¬visit(w) then assert[u → w]; dfs(w, u)
                              elseif w ≠ v then assert[u ⇢ w]; ∅
                              else ∅)
      end

Similar derivations can be carried out in the case of directed graphs; the resulting algorithms
are simpler since the father tests (and associated parameters) are not needed.
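
As an illustration of Algorithm 2.6 for undirected graphs, the assertions can be realised by recording edges in two lists. This is a sketch under our own assumptions: the names `tree` and `fronds`, the use of `None` in place of Λ, and the triangle example are not from the derivation.

```python
def classify_edges(adj):
    """Algorithm 2.6 with each assertion turned into an append to a list."""
    visit = {x: False for x in adj}
    tree, fronds = [], []

    def dfs(u, father):
        visit[u] = True
        for w in adj[u]:
            if not visit[w]:
                tree.append((u, w))           # assert[u -> w]
                dfs(w, u)
            elif w != father:
                fronds.append((u, w))         # assert[u frond w]

    for r in adj:
        if not visit[r]:
            dfs(r, None)                      # None plays the role of the root marker
    return tree, fronds


# Hypothetical example: a triangle on the vertices 0, 1, 2.
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
tree, fronds = classify_edges(triangle)
```

Note that the single non-tree edge of the triangle is asserted twice, once from each endpoint, because the test `w != father` only excludes the tree edge by which u was reached.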

Tree Orderings. All of the lemmas on which the depth-first-search algorithms are based make
use of “non-local” properties of depth-first search trees; that is, they require testing relations
between vertices that may be an arbitrary distance apart in the trees. In particular, both the
biconnectivity and strong connectivity algorithms are based on lemmas that make use of ancestor
or descendent orderings in the search forest. We make derivation steps here that will enable these
relations to be precomputed entirely in the course of a single depth-first-search pass. (We now
expand our discussion to include directed graphs as well as undirected graphs.)

The  descendent  ordering  is  the  transitive  closure  of  the  ordering  represented  by  the  edges  of 
the  depth-first  search  forest.  That  is, 

    v ≻ u if and only if there is a path of tree edges from u to v.

We  can  introduce  this  ordering  into  the  dfs  program  simply  by  adding  an  appropriate  assertion, 
as we did in the case of father. After the entire graph has been traversed by dfs, the descendent
ordering will be the transitive closure of the asserted relation, ≻. An easy induction proof can
establish  that  no  contradictory  relations  are  asserted.  (We  have  temporarily  eliminated  the  father 
parameter  in  order  to  extend  the  applicability  of  this  algorithm  to  directed  graphs.) 

    dfs(u) ⇐
      begin
        visit(u) ← true;                                                 (2.7)
        {u} ∪ ⋃_{w ∈ Adj(u)} (if ¬visit(w) then assert[w ≻ u]; dfs(w) else ∅)
      end

We  seek  linear-time  algorithms,  so  we  will  not  be  able  to  accept  a  naive  implementation  of  this 
algorithm: computation of the transitive closure alone would typically require O(|V|^3) time. We
must  therefore  continue  the  derivation  process  and  make  use  of  further  combinatorial  properties. 
Since  we  are  concerned  with  descendent  orderings  in  trees,  it  is  natural  to  consider  introducing 
explicit  pre-  and  endorder  relations  represented  by  numberings.  Both  numberings  can  be  computed 
in  linear  time  in  a  single  tree  traversal  and,  in  combination,  determine  the  descendent  ordering. 

LEMMA 2.1. Let T be a tree with vertices numbered in preorder and endorder (in arrays pre(V)
and end(V)). Then

    u ≻ v if and only if pre(u) > pre(v) and end(u) < end(v).

That  is,  u  is  a  proper  descendent  of  v  if  and  only  if  both  u  succeeds  v  in  the  preorder  numbering 
and  u  precedes  v  in  the  endorder  numbering. 

Preorder  and  endorder  numbers  in  the  depth-first  search  forest  can  be  assigned  through  the 
use  of  assignable  free  variables  (that  we  will  call  p  and  e)  in  the  dfs  procedure.  For  brevity,  we  omit 
intermediate  derivation  steps  that  lead  to  the  following  imperative  algorithm  (on  either  directed 
or undirected graphs) for simultaneously calculating the two numberings.

    p ← 0; e ← 0;
    dfs(u) ⇐
      begin
        visit(u) ← true;                                                 (2.8)
        pre(u) ← p ← p + 1;
        value ({u} ∪ ⋃_{w ∈ Adj(u)} (if ¬visit(w) then dfs(w) else ∅));
        end(u) ← e ← e + 1
      end

(The value notation is explained in the remark following Algorithm 2.2.) If pre(V) is initially zero
then the visit array can be eliminated by replacing the if test with the test pre(w) = 0; we do this
below.
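
The numbering of Algorithm 2.8 and the test of Lemma 2.1 can be sketched as follows. The Python rendering is ours; in particular the `father` map and the function `descends_by_fathers` are hypothetical helpers introduced only to check the lemma against the definition of the descendent ordering, and the example graph is an assumption. The visit array is already replaced by the test pre(w) = 0.

```python
def number_forest(adj):
    """Assign preorder and endorder numbers over the depth-first forest."""
    pre = {x: 0 for x in adj}     # pre = 0 doubles as "not yet visited"
    end = {x: 0 for x in adj}
    father = {}                   # kept only to check the lemma independently
    p = e = 0

    def dfs(u):
        nonlocal p, e
        p += 1
        pre[u] = p
        for w in adj[u]:
            if pre[w] == 0:
                father[w] = u
                dfs(w)
        e += 1
        end[u] = e

    for r in adj:
        if pre[r] == 0:
            father[r] = None
            dfs(r)
    return pre, end, father


def is_proper_descendant(u, v, pre, end):
    """Lemma 2.1: u is a proper descendent of v."""
    return pre[u] > pre[v] and end[u] < end[v]


def descends_by_fathers(u, v, father):
    """Reference test: follow father links upward from u looking for v."""
    x = father[u]
    while x is not None:
        if x == v:
            return True
        x = father[x]
    return False


# Hypothetical directed graph; vertex 4 starts a second tree of the forest.
graph = {0: [1, 2], 1: [3], 2: [3], 3: [], 4: [0]}
pre, end, father = number_forest(graph)
```

The two constant-time comparisons replace a walk up the tree, which is what makes the non-local ancestry tests affordable inside a linear-time search.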

Certain  ancestry  tests  do  not  require  use  of  both  of  the  numberings.  In  particular,  if  it  is  known 
that  two  vertices  are  related  by  the  descendent  relation,  but  it  is  not  known  in  which  direction  they 
are related, then (by a simple corollary of the Lemma above) either preorder or endorder suffices.
Since  the  preorder  numbering  is  also  useful  as  a  replacement  for  the  visit  array,  we  choose  it.  (The 
resulting  algorithm  will  be  applied  in  the  next  section  to  undirected  graphs,  so  we  reintroduce  the 
father  parameter  and  corresponding  assertions.) 

    p ← 0; pre(V) ← 0;

    dfs(u, v) ⇐
      begin
        pre(u) ← p ← p + 1;                                              (2.9)
        {u} ∪ ⋃_{w ∈ Adj(u)} (if pre(w) = 0 then assert[u → w]; dfs(w, u)
                              elseif w ≠ v then assert[u ⇢ w]; ∅
                              else ∅)
      end



3. Computing Biconnected Components.

Let G = (V, E) be an undirected connected graph. An articulation point is a vertex whose
removal  disconnects  G.  A  graph  is  biconnected  if  it  has  no  articulation  point.  A  biconnected 
component  C  is  a  maximal  set  of  edges  that  contains  no  vertex  whose  removal  disconnects  the 
vertices  contained  in  the  edges  of  C. 

Our  derivation  of  the  linear-time  algorithm  for  detecting  biconnected  components  is  based  on 
a  technical  lemma  characterizing  articulation  points.  Once  the  articulation  points  are  found  then 
the  biconnected  components  can  be  collected  in  a  single  depth-first  traversal.  Assume  that  the 
depth-first  search  algorithm  has  been  performed  on  a  graph  to  obtain  a  depth-first  search  tree. 

Lemma 3.1. (Hopcroft-Tarjan) A vertex a of a depth-first search tree is an articulation point if (1)
it is the root and has more than one son, or if (2) there is a son u of a none of whose descendents
have a frond to a proper ancestor of a. That is, if a is not the root, then it is an articulation point
exactly if it has a son u such that Low(u) ≽ a, where

    Low(u) = min_≻(Lowset(u))
    and Lowset(u) = {u} ∪ {x | s ≽ u ∧ s ⇢ x}.

Note that the elements of Lowset(u) are always comparable under ≻, so we may use
Lemma 2.1 to replace our explicit use of the ordering by tests on pre values, as in Algorithm
2.9.

We  first  derive  an  algorithm  for  detecting  articulation  points  and  then  modify  it  to  enable 
the  biconnected  components  to  be  collected  on  the  fly.  Articulation  point  detection  is  achieved  by 
specializing  the  dfs  algorithm  in  three  stages — computation  of  Lowset,  computation  of  Low,  and 
assertion of articulation points.

Lowset Computation. Since s ≽ u if and only if s ∈ dfs(u), we see that Lowset(u) is equivalent
to {x | s ∈ dfs(u) ∧ s ⇢ x}. We derive a program to compute the value of this expression by
expanding the definition of dfs given by Algorithm 2.9. This will require introduction of the father
parameter to dfs in the set specification. Direct substitution for dfs and preliminary simplification
yield

    {x | s ∈ dfs(u, v) ∧ s ⇢ x} ⇐
      begin
        pre(u) ← p ← p + 1;
        {x | s ∈ {u} ∧ s ⇢ x}
        ∪ {x | s ∈ ⋃_{w ∈ Adj(u)} (if pre(w) = 0 then assert[u → w]; dfs(w, u)
                                   elseif w ≠ v then assert[u ⇢ w]; ∅
                                   else ∅) ∧ s ⇢ x}
      end .

(For  conciseness,  we  omit  the  initialization  of  p  and  pre.)  Distribution  of  the  set  abstraction  into 
the  union  and  conditional  and  simplification  give 

    {x | s ∈ dfs(u, v) ∧ s ⇢ x} ⇐
      begin
        pre(u) ← p ← p + 1;
        {x | u ⇢ x}
        ∪ ⋃_{w ∈ Adj(u)} (if pre(w) = 0 then assert[u → w];              (3.2)
                            {x | s ∈ dfs(w, u) ∧ s ⇢ x}
                          elseif w ≠ v then assert[u ⇢ w]; ∅
                          else ∅)
      end .

We  can  now  form  a  recursion;  this  is  done  by  replacing  both  instances  of  the  dfs  set  abstraction 
with  a  name,  Lowset( u,  v)  (equivalent  to  the  old  Lowset  with  a  father  parameter  added).  Further, 
since all the fronds for u are computed inside the loop we decide to commute the outer union,
postponing calculation of {x | u ⇢ x} until the fronds have been detected.

    Lowset(u, v) ⇐
      begin
        pre(u) ← p ← p + 1;
        ⋃_{w ∈ Adj(u)} (if pre(w) = 0 then assert[u → w]; Lowset(w, u)   (3.3)
                        elseif w ≠ v then assert[u ⇢ w]; ∅
                        else ∅)
        ∪ {x | u ⇢ x}
      end

Now since {x | u ⇢ x} is equivalent to

    ⋃_{w ∈ Adj(u)} (if (u ⇢ w) then {w} else ∅),

we  substitute  this  expression  into  Algorithm  3.3,  merge  the  unions,  and  simplify  on  the  basis  of 
the  assertions  to  get  the  final  Lowset  program. 

    Lowset(u, v) ⇐
      begin
        pre(u) ← p ← p + 1;
        ⋃_{w ∈ Adj(u)} (if pre(w) = 0 then Lowset(w, u)                  (3.4)
                        elseif w ≠ v then {w}
                        else ∅)
      end

(The  assertions  have  been  dropped  to  avoid  clutter.) 
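
Algorithm 3.4 can be exercised on a small graph. The sketch below is ours: it records the value computed at every vertex in a dictionary `result` (a hypothetical addition, so that intermediate values can be inspected), uses `None` for the father of the root, and assumes a connected undirected graph given as a dictionary of neighbour lists.

```python
def lowset(adj, root):
    """Algorithm 3.4, recording Lowset(u, father) for every vertex u reached."""
    pre = {x: 0 for x in adj}
    result = {}
    p = 0

    def visit(u, v):
        nonlocal p
        p += 1
        pre[u] = p
        found = set()
        for w in adj[u]:
            if pre[w] == 0:
                found |= visit(w, u)          # fronds leaving the subtree of w
            elif w != v:
                found.add(w)                  # u has a frond to w
        result[u] = found
        return found

    visit(root, None)
    return result


# Hypothetical example: a triangle 0, 1, 2 with a pendant vertex 3 attached to 2.
graph = {0: [1, 2], 1: [0, 2], 2: [1, 0, 3], 3: [2]}
sets = lowset(graph, 0)
```

In this example the frond between 2 and 0 shows up in the sets of 2, 1 and 0, while the pendant vertex 3 has no frond at all.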


Low  Computation.  A  similar  specialization  sequence  is  now  used  to  transform  this  algorithm 
into a program for Low(u, v), defined as min_≻({u} ∪ Lowset(u, v)), where v is the father of u.

    Low(u, v) ⇐
      begin
        pre(u) ← p ← p + 1;
        min_≻(u, min_≻ over w ∈ Adj(u) of (if pre(w) = 0 then Low(w, u)  (3.5)
                                           elseif w ≠ v then w
                                           else ∞))
      end

(Here ∞ denotes a maximal vertex value; note that u would do.) We obtained this program by
expanding  the  definition  of  Lowset  in  the  definition  of  Low,  simplifying,  and  renaming. 

Articulation  Point  Detection.  We  are  now  ready  to  locate  articulation  points.  Excepting  the 
root, the articulation points can be located simply by inspecting Low values in the depth-first-search
tree. That is, u is an articulation point if u → w and Low(w, u) ≽ u. We specialize as before, but
this  time  the  effect  is  simply  to  introduce  the  appropriate  assertion  into  the  search  algorithm. 

    Low(u, v) ⇐
      begin
        pre(u) ← p ← p + 1;
        min_≻(u, min_≻ over w ∈ Adj(u) of (if pre(w) = 0
                   then begin if l ≽ u then assert[Art(u)]; l end        (3.6)
                        where l = Low(w, u)
                   elseif w ≠ v then w
                   else ∞))
      end

(As with the relations asserted earlier, if Art is not asserted for a particular vertex then that
vertex is not an articulation point.)

As  a  final  step,  we  make  the  min  calculation  explicit  by  introducing  a  new  local  variable  in 
Low. At the same time, we replace tests on the tree ordering with comparisons of pre values (following our
earlier  remark).  The  values  of  Low  are  thus  now  integers  rather  than  vertices;  our  main  interest, 
however,  has  become  the  assertions  regarding  articulation  points. 

    p ← 0; pre(V) ← 0; Low(r, Λ);

    Low(u, v) ⇐
      begin var m;
        m ← pre(u) ← p ← p + 1;
        for w ∈ Adj(u) do
          if pre(w) = 0                                                  (3.7)
            then begin if l ≥ pre(u) then assert[Art(u)]; m ← min(m, l) end
                 where l = Low(w, u)
          elseif w ≠ v then m ← min(m, pre(w));
        m
      end
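
A runnable sketch of this final program, in Python, may help in checking the steps. It is an illustration under our own assumptions: the set `art` stands for the Art assertions, `None` stands for Λ, and the root is handled separately by counting its sons as in clause (1) of Lemma 3.1 (the program above deliberately leaves the root aside). The graph is assumed connected and undirected, without self-loops or parallel edges.

```python
def articulation_points(adj, root):
    """Low computed on pre values, asserting Art(u) when a son's Low is >= pre(u)."""
    pre = {x: 0 for x in adj}
    art = set()
    p = 0
    root_sons = 0

    def low(u, v):
        nonlocal p, root_sons
        p += 1
        m = pre[u] = p
        for w in adj[u]:
            if pre[w] == 0:
                if v is None:
                    root_sons += 1            # our addition: clause (1) of Lemma 3.1
                l = low(w, u)
                if l >= pre[u] and v is not None:
                    art.add(u)                # assert[Art(u)]
                m = min(m, l)
            elif w != v:
                m = min(m, pre[w])
        return m

    low(root, None)
    if root_sons > 1:
        art.add(root)
    return art


# Hypothetical examples: a triangle 0, 1, 2 with a tail 2 - 3 - 4, and a path 0 - 1 - 2.
graph = {0: [1, 2], 1: [0, 2], 2: [1, 0, 3], 3: [2, 4], 4: [3]}
path3 = {0: [1], 1: [0, 2], 2: [1]}
```

In the first example the tail vertices 2 and 3 are reported, and the triangle vertices 0 and 1 are not, whichever vertex the search happens to enter first.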


Biconnected Components. The following lemma will enable collection of biconnected components
to be performed simultaneously with the detection of articulation points. As before, assume
that depth-first search has been performed on a given graph. Let C be a biconnected component
and let V_C be the set of vertices contained in the edges of C.

LEMMA 3.2. (Hopcroft and Tarjan) For every biconnected component C, there is a unique vertex
a ∈ V_C such that v ≽ a for all v ∈ V_C and a is the root of the search tree or an articulation point.

[[ For brevity, the remainder of this section has been omitted. It will appear in the final paper. ]]


4. Strongly-Connected Components.

In  this  section  we  derive  an  interesting  linear  time  algorithm  for  strong  connectivity  attributed 
to Kosaraju. Let us return to the original definition of path. Let G = (V, E) be a directed graph
and let u and v range over the vertices V.

    path(u, v) ⇐ (u = v or (∃w ∈ Adj(u)) path(w, v))                    (4.1)

The  path  relation  holds  between  u  and  v  just  when  there  is  a  directed  path  in  G  from  u  to  v.  Since 
we  are  dealing  with  directed  graphs,  it  will  be  helpful  to  test  reverse  paths  as  well. 

    revpath(u, v) ⇐ (u = v or (∃w ∈ Adj⁻¹(u)) revpath(w, v))            (4.2)

Clearly, path(u, v) if and only if revpath(v, u).

Two vertices u and v in a graph are strongly connected if path(u, v) and revpath(u, v) both
hold. A maximal set of strongly connected vertices is called a strongly connected component. To
find the strongly connected component associated with a particular vertex r, it suffices to collect
all  vertices  u  reachable  from  r  such  that  path(u,  r).  For,  if  both  v  and  w  have  this  property  with 
respect  to  r,  then  by  transitivity  of  the  path  relation,  they  are  themselves  strongly  connected.  This 
implies  that  strongly  connected  components  are  disjoint. 

    strong(V) ⇐ ⋃_{r ∈ V} {sc(r, r)}
                                                                         (4.3)
    sc(u, r) ⇐ {u} ∪ ⋃_{w ∈ Adj(u)} (if path(w, r) then sc(w, r))

(Note  that  sc  is  not  yet  a  terminating  program.) 

The  trick  in  this  derivation  comes  from  the  observation  that  the  second  parameter  of  the  path 
relation  remains  constant  on  all  recursive  calls  of  sc  for  a  particular  root.  This  suggests  that  we 
should  be  able  to  do  a  single  depth-first  traversal  from  r  and  use  the  orderings  defined  in  Section 
2  to  test  ancestry. 

There  are  two  ways  we  can  obtain  this  advantage.  First,  we  could  use  revpath  instead  of  path, 
and  compute  ancestry  using  its  depth-first  search  tree  (since  the  dfs  realizations  of  path  and  revpath 
do  recursion  on  the  first  parameter).  Alternatively,  we  could  reverse  the  direction  of  the  search  in 
sc above (using Adj⁻¹ instead of Adj), causing the path test parameters to be reversed, and thus
use  the  path  search  tree.  In  either  case,  we  will  need  to  traverse  the  graph  in  both  the  forward  and 
backward  directions. 

The  situation  is  symmetrical,  and  we  arbitrarily  choose  the  latter  alternative.  By  reversing 
Algorithm  4.3  and  introducing  a  boolean  array  (as  we  did  in  Algorithm  2.2),  we  obtain 

    strong(V) ⇐ begin visit2(V) ← false; ⋃_{r ∈ V} (if visit2(r) then ∅ else {scr(r, r)}) end

    scr(u, r) ⇐
      begin
        visit2(u) ← true;                                                (4.4)
        {u} ∪ ⋃_{w ∈ Adj⁻¹(u)} (if ¬visit2(w) ∧ path(r, w) then scr(w, r))
      end


(We call the boolean array visit2 to avoid name conflict with the boolean array used in the forward
depth-first  search.) 
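
To make the behaviour of Algorithm 4.4 concrete, here is a sketch of ours in Python. The path test is left unspecialized, as in the algorithm: it is answered by a hypothetical helper `reachable` that performs a fresh forward search on every call, which is deliberately wasteful. The construction of `radj` for Adj⁻¹ and the example graph are likewise assumptions made for the example.

```python
def reachable(adj, r):
    """All w with path(r, w), by an explicit-stack forward search."""
    seen, stack = {r}, [r]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen


def strong(adj):
    """Algorithm 4.4: reverse search from r, filtered by the test path(r, w)."""
    radj = {x: [] for x in adj}               # reversed adjacency lists
    for u in adj:
        for w in adj[u]:
            radj[w].append(u)
    visit2 = {x: False for x in adj}

    def scr(u, r):
        visit2[u] = True
        found = {u}
        for w in radj[u]:
            if not visit2[w] and w in reachable(adj, r):
                found |= scr(w, r)
        return found

    return [scr(r, r) for r in adj if not visit2[r]]


# Hypothetical directed graph: a cycle 0 -> 1 -> 2 -> 0 feeding a cycle 3 <-> 4.
graph = {0: [1], 1: [2], 2: [0, 3], 3: [4], 4: [3]}
components = strong(graph)
```

The repeated forward searches are the cost that the remainder of the derivation sets out to remove.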

A  Blind  Alley.  It  may  appear  that  we  could  obtain  an  acceptable  implementation  of  Algorithm 
4.4 by replacing path(r, w) with the test w ∈ dfs(r) and factoring the dfs calculation out of scr into
strong.

    strong(V) ⇐ begin
        visit2(V) ← false;
        ⋃_{r ∈ V} (if visit2(r) then ∅ else {scr′(r, r, dfs(r))})
      end

    scr′(u, r, D) ⇐                                                      (4.5)
      begin
        visit2(u) ← true;
        {u} ∪ ⋃_{w ∈ Adj⁻¹(u)} (if ¬visit2(w) ∧ w ∈ D then scr′(w, r, D))
      end

Unfortunately, the set of roots required for the reverse-search forest is not necessarily the same
as  that  required  for  forward  search,  and  so  the  dfs(r)  calculation  in  strong  could  do  redundant 
traversals. 

Testing  Ancestry.  Rather  than  solving  this  problem  directly,  we  choose  instead  to  consider 
more subtle methods for realizing the path test. In particular, we follow the specialization methodology
and make use of facts about the context in which the path test occurs in order to produce a
specialized  program  for  that  test. 

As  in  the  case  of  biconnected  component  detection,  we  will  need  to  state  several  structural 
lemmas.  These  lemmas  will  suggest  methods  by  which  the  specialization  can  be  achieved  and 
hence  by  which  strongly  connected  components  can  be  detected  quickly.  Again  as  in  the  case  of 
biconnectivity,  the  lemmas  refer  to  depth-first  search  forests  and  the  descendent  relations  they 
induce  over  vertices. 

forest(V) ⇐ begin visit(V) ← false; for r ∈ V do (if ¬visit(r) then dfs(r)) end

dfs(u) ⇐
begin
    visit(u) ← true;                                                         (4.6)
    for w ∈ Adj(u) do
        if ¬visit(w) then assert[w ≻ u ∧ u → w] dfs(w)
end

(As before, we will let '≻' stand for its own transitive closure; thus, if u and v are vertices, then
u ≻ v if u is a proper descendent of v.)

Assume  that  a  depth-first  search  has  been  performed  on  a  directed  graph  to  obtain  a  depth-first 
search  forest. 

LEMMA 4.1. For each strongly connected component S there is a unique vertex r, called the root
of S, such that r = min_≻(S).

It follows from this lemma that if r is the root of a strongly connected component and there
is a path from w to r then

    path(r, w)  if and only if  w ≻ r


This fact will enable us to replace the path test in Algorithm 4.4 with a test of the descendent
relation. Recall from Section 2 that the descendent relation can be computed efficiently using pre-
and endorder numberings. (We do this in Algorithm 4.10 below.)
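
To make the numbering test concrete, here is a minimal Python sketch, under the assumption that the graph is a dictionary from vertices to successor lists. The function names number_forest and is_proper_descendent are ours. A vertex w is a proper descendent of r in the depth-first forest exactly when it is entered after r and finished before r.

```python
def number_forest(adj):
    """Depth-first search assigning preorder and endorder numbers (starting at 1)."""
    pre, end = {}, {}
    p = e = 0

    def dfs(u):
        nonlocal p, e
        p += 1
        pre[u] = p              # numbered on entry
        for w in adj[u]:
            if w not in pre:
                dfs(w)
        e += 1
        end[u] = e              # numbered on exit

    for r in adj:
        if r not in pre:
            dfs(r)
    return pre, end


def is_proper_descendent(pre, end, w, r):
    """Constant-time test of w ≻ r using the two numberings."""
    return pre[w] > pre[r] and end[w] < end[r]


tree_like = {1: [2, 3], 2: [4], 3: [], 4: [], 5: [1]}
pre, end = number_forest(tree_like)
```

After the numbering pass, each ancestry query costs two comparisons, which is what lets the path test be replaced without a further search.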

We will, however, have to modify the strong program so that r ranges only over the strongly-
connected-component roots; otherwise our replacement would be invalid. A further lemma below
will enable us to obtain the roots easily.


strong(R) ⇐ begin visit2(V) ← false; ⋃_{r ∈ R} {scr(r, r)} end

scr(u, r) ⇐
begin
    visit2(u) ← true;
    {u} ∪ ⋃_{w ∈ Adj⁻¹(u)} ( if ¬visit2(w) ∧ w ≻ r then scr(w, r) )
end                                                                          (4.7)


We  assume  here  that  R  is  exactly  the  set  of  strongly-connected-component  roots.  Since  the 
components  are  disjoint  and  since  by  Lemma  4.1  each  has  exactly  one  root,  strong(R)  will  yield 
all  the  strongly  connected  components  in  the  graph. 


Finding  Roots.  We  must  now  consider  the  problem  of  efficiently  locating  the  roots.  Observe 
that  it  follows  from  the  lemma  above  that  every  root  in  the  depth-first  search  forest  is  the  root  of 
a  strongly  connected  component.  This  suggests  that  we  should  modify  forest  to  collect  the  roots 
of  the  depth-first  search  trees  as  they  are  explored. 

forest(V) ⇐ begin visit(V) ← false; ⋃_{r ∈ V} (if ¬visit(r) then dfs(r); {r} else ∅) end      (4.8)
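
A direct Python rendering of Algorithm 4.8 may help fix ideas. This is a sketch under our own assumptions (dictionary graph representation, vertices tried in dictionary order), not a definitive implementation.

```python
def forest(adj):
    """Depth-first search that returns the roots of the search trees, in order."""
    visit = {v: False for v in adj}

    def dfs(u):
        visit[u] = True
        for w in adj[u]:
            if not visit[w]:
                dfs(w)

    roots = []
    for r in adj:
        if not visit[r]:
            dfs(r)
            roots.append(r)   # r started a new tree, so it is a component root
    return roots


example = {1: [2], 2: [3], 3: [1, 4], 4: [5], 5: [4], 6: [5]}
tree_roots = forest(example)
```

On this example the search returns vertices 1 and 6. The component {4, 5} also has a root, vertex 4, but it lies inside the tree rooted at 1 and so is not collected here.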


Finding  More  Roots.  Algorithm  4.8  collects  only  a  subset  of  the  component  roots.  The 
following  lemma  suggests  an  approach  for  locating  the  remainder  of  the  roots. 

LEMMA 4.2. Let S be a strongly connected component. For each v ∈ S and each w ∉ S such that
v → w, w is the root of a strongly connected component.

Since  all  vertices  are  included  in  the  depth-first  forest,  the  roots  specified  by  this  lemma 
together  with  the  roots  collected  by  forest  comprise  the  set  of  all  component  roots.  The  following 
program  returns  a  set  of  the  new  strongly  connected  component  roots  at  the  outgoing  edges  of  a 
given  component.  Let  r  be  the  root  of  component  S. 

update(r, S) ⇐
    ⋃_{v ∈ S} ( ⋃_{w ∈ Adj(v)} (if ¬visit2(w) ∧ v → w then {w} else ∅) )      (4.9)

This program essentially implements the lemma, with the additional optimization of examining
visit2 to avoid collecting redundant roots. (We omit the derivation of this optimization.) Also, note
that it is possible that the representation of S might make the outer union difficult to compute;
though we do not do it here, it would then be advantageous to use the observation that the
descendents of a strongly connected component can be found by depth-first search and do the
search instead.
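
As a hedged illustration, update can be written as a single set comprehension in Python. The sketch assumes that the forward search has recorded its tree edges (the edges v → w asserted by dfs) in a set of pairs called tree_edges, and that visit2 is a dictionary of booleans; both names are our own conventions. The parameter r of the original program is unused in the body and is omitted.

```python
def update(S, adj, tree_edges, visit2):
    """New component roots reached by tree edges leaving the component S."""
    return {
        w
        for v in S
        for w in adj[v]
        if not visit2[w] and (v, w) in tree_edges
    }


example = {1: [2], 2: [3], 3: [1, 4], 4: [5], 5: [4], 6: [5]}
# Tree edges of a depth-first search started at vertex 1, then at vertex 6.
tree_edges = {(1, 2), (2, 3), (3, 4), (4, 5)}
visit2 = {v: v in {1, 2, 3} for v in example}   # component {1, 2, 3} just found
new_roots = update({1, 2, 3}, example, tree_edges, visit2)
```

Here the only tree edge leaving {1, 2, 3} is 3 → 4, so vertex 4 is reported as a new root. The edge 6 → 5 is not a tree edge, so it never contributes a root.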

The Algorithm. We have now resolved the strongly connected component algorithm into two
phases. First, forest is used to collect depth-first search forest roots and to precompute the pre-
and endorder numbering used for testing ancestry. Second, strong is used to do reverse depth-first
searches from these roots, collecting strongly connected components along the way. The update
procedure is used to collect new roots as strongly connected components are found.

var p, e, pre(V), end(V), visit2;

forest(V) ⇐
begin
    pre(V) ← 0;
    p ← e ← 0;
    ⋃_{r ∈ V} (if pre(r) = 0 then dfs(r); {r} else ∅)
end

dfs(u) ⇐
begin
    pre(u) ← p ← p + 1;
    value ( {u} ∪ ⋃_{w ∈ Adj(u)} (if pre(w) = 0 then assert[u → w]; dfs(w) else ∅) );
    end(u) ← e ← e + 1
end

strong(R) ⇐
begin var r, S;                                                              (4.10)
    choose r ∈ R;
    S ← scr(r, r);
    assert(Component(S));
    strong(update(r, S) ∪ R − {r})
end

scr(u, r) ⇐
begin
    visit2(u) ← true;
    {u} ∪ ⋃_{w ∈ Adj⁻¹(u)} ( if ¬visit2(w) ∧ pre(w) > pre(r) ∧ end(w) < end(r)
                              then scr(w, r) )
end

update(r, S) ⇐
    ⋃_{v ∈ S} ( ⋃_{w ∈ Adj(v)} (if ¬visit2(w) ∧ v → w then {w} else ∅) )
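
For readers who wish to experiment, the two phases can be assembled into a short Python sketch. It is an illustration under several assumptions of ours, not the authors' final code: the graph is a dictionary mapping every vertex to its successor list, the asserted tree edges are stored in a set called tree, the tail-recursive strong is written as a loop that stops when R is empty, and visit2 is initialized to false.

```python
def strong_components(adj):
    """Two-phase sketch in the style of Algorithm 4.10; returns a set of frozensets."""
    radj = {v: [] for v in adj}            # Adj^{-1}
    for v in adj:
        for w in adj[v]:
            radj[w].append(v)

    pre = {v: 0 for v in adj}              # 0 means "not yet visited"
    end = {v: 0 for v in adj}
    visit2 = {v: False for v in adj}
    tree = set()                           # tree edges u -> w asserted by dfs
    p = e = 0

    def dfs(u):
        nonlocal p, e
        p += 1
        pre[u] = p
        for w in adj[u]:
            if pre[w] == 0:
                tree.add((u, w))
                dfs(w)
        e += 1
        end[u] = e

    def forest():
        roots = set()
        for r in adj:
            if pre[r] == 0:
                dfs(r)
                roots.add(r)
        return roots

    def scr(u, r):
        visit2[u] = True
        comp = {u}
        for w in radj[u]:
            # Ancestry test replaces the path test: w must be a proper descendent of r.
            if not visit2[w] and pre[w] > pre[r] and end[w] < end[r]:
                comp |= scr(w, r)
        return comp

    def update(S):
        return {w for v in S for w in adj[v]
                if not visit2[w] and (v, w) in tree}

    R = forest()                           # phase 1: forest roots and numberings
    comps = set()
    while R:                               # phase 2: reverse searches from roots
        r = R.pop()
        S = scr(r, r)
        comps.add(frozenset(S))
        R |= update(S)
    return comps


example = {1: [2], 2: [3], 3: [1, 4], 4: [5], 5: [4], 6: [5]}
result = strong_components(example)
```

The recursive calls mirror the pseudocode and are adequate for small graphs; a large graph would require an explicit stack or a raised recursion limit.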

Of course, the program derivation process has no definite termination criteria. We could continue
improving this algorithm by realizing the various implicit loops, by frequency reduction (e.g., for
the pre(r) calculation), by eliminating set operations (e.g., in strong), and in many other ways. We
conclude at this point, however, since the structure of the linear-time algorithm is now most clearly
apparent and since the next set of derivation steps falls within the range of established techniques.

5. Conclusions.

This  work  is  a  step  towards  developing  a  new  paradigm  for  the  presentation  and  explication 
of  complex  algorithms  and  programs.  It  seems  to  us  insufficient  to  simply  provide  a  program  or 
algorithm  in  final  form  only.  Even  with  “adequate”  documentation  and  proof,  the  final  code  cannot 
be  as  revealing  to  the  intuition  as  a  derivation  of  that  code  from  initial  specifications. 

Ideally,  a  programming  environment  should  support  the  programmer  in  the  process  of  building 
derivations. 
In a specific problem domain, such as graph algorithms, certain facts and fundamental algorithms
should be available for access. The value of this store of facts should not be underestimated.
In our derivations, for example, certain algorithms were repeatedly used as paradigms for the development
of other algorithms. This kind of analogical development is similar in heuristic content to
the goal-directed transformation of algorithms required to carry out the loop merging optimization
or in order to create recursive calls during specialization.

We are still very far from automating the heuristic side of the derivation process. In fact, we
argue that at this point our efforts are better directed at discovering and exercising useful transformations,
developing foundations for proving their correctness, and developing tools for interactive
program development that can make appropriate use of outside domain-specific knowledge. For
example, it appears that once the necessary outside lemmas are stated and proved, only a modest
deduction capability would be required in such a programming environment; it would be used mainly
to establish preconditions for transformations and application of lemmas.

Finally,  by  storing  program  derivations  as  data  structures  in  a  program  development  system, 
program  modifications  can  be  carried  out  simply  by  making  changes  at  the  appropriate  places  in 
the  derivation  structure;  if  only  the  final  code  is  available,  the  conceptual  history  of  the  program 
must,  in  effect,  be  rediscovered. 